Text Mining Techniques for Building a Biolexicon

Author

  • Sophia Ananiadou
Abstract

My talk will focus on building a biolexicon by leveraging existing bio-resources, combining them within a common, standardised lexical, terminological and conceptual representation framework, and employing advanced natural language (NL) technologies to discover new terms, concepts, relations and linguistic lexical information from text. In particular, I will discuss term normalisation techniques, named entity recognition and smart dictionary lookup. This research forms part of the work of the National Centre for Text Mining (www.nactem.ac.uk) and the BOOTStrep project.

Proceedings of the Australasian Language Technology Workshop 2007, pages 1-1
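
As a rough illustration of how term normalisation can support dictionary lookup, the sketch below maps surface variants of a term onto one canonical key before consulting a lexicon. It is not the method used in the talk or in BOOTStrep: the normalisation rule (lowercase, then drop every character that is not an ASCII letter or digit), the function names and the toy concept identifiers are all assumptions made for this example.

```python
import re


def normalise_term(term: str) -> str:
    """Reduce a term to a canonical key (hypothetical rule, for illustration).

    Lowercases the term and removes everything except ASCII letters and
    digits, so hyphenation, spacing and case differences collapse together.
    Non-ASCII characters (for example Greek letters) are simply dropped.
    """
    return re.sub(r"[^a-z0-9]", "", term.lower())


def build_lexicon(entries: dict) -> dict:
    """Index every known variant of each concept under its normalised key."""
    lexicon = {}
    for concept_id, variants in entries.items():
        for variant in variants:
            lexicon[normalise_term(variant)] = concept_id
    return lexicon


def lookup(lexicon: dict, mention: str):
    """Return the concept id for a text mention, or None if it is unknown."""
    return lexicon.get(normalise_term(mention))


# Toy entries with made-up concept identifiers (not taken from any real resource).
TOY_ENTRIES = {
    "CONCEPT:1": ["NF-kappa B", "NF kappaB"],
    "CONCEPT:2": ["interleukin-2", "IL-2"],
}

lexicon = build_lexicon(TOY_ENTRIES)
match_a = lookup(lexicon, "nf-KappaB")  # unseen spelling of a known term
match_b = lookup(lexicon, "IL 2")       # spacing variant
match_c = lookup(lexicon, "p53")        # not in the toy lexicon
```

Aggressive normalisation of this kind trades precision for recall: it lets unseen spellings match, but it can also merge distinct terms that differ only in punctuation, so a practical system would usually apply more careful, domain-specific rules.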

Similar resources

A Specialised Verb Lexicon as the Basis of Fact Extraction in the Biomedical Domain

The BioLexicon is a standardised, reusable, lexical and conceptual resource suitable for advanced biomedical text mining. One of the unique features of the BioLexicon is the incorporation of rich syntactic and semantic patterns for a wide range of domain-relevant verbs, which have been acquired semiautomatically from biomedical corpora. Such types of information can be highly beneficial for inf...

A lexicon for biology and bioinformatics: the BOOTStrep experience

This paper describes the design, implementation and population of a lexical resource for biology and bioinformatics (the BioLexicon) developed within an ongoing European project. The aim of this project is text-based knowledge harvesting for support to information extraction and text mining in the biomedical domain. The BioLexicon is a large-scale lexical-terminological resource encoding differ...

A Model for Extracting Information from Textual Documents, Based on Text Mining in the E-Learning Domain

As computer networks become the backbones of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...

Three BioNLP Tools Powered by a Biological Lexicon

In this paper, we demonstrate three NLP applications of the BioLexicon, which is a lexical resource tailored to the biology domain. The applications consist of a dictionary-based POS tagger, a syntactic parser, and query processing for biomedical information retrieval. Biological terminology is a major barrier to the accurate processing of literature within the biology domain. In order to address t...

The Value of an in-Domain Lexicon in genomics QA

This paper demonstrates that a large-scale lexicon tailored for the biology domain is effective in improving question analysis for genomics Question Answering (QA). We use the TREC Genomics Track data to evaluate the performance of different question analysis methods. It is hard to process textual information in biology, especially in molecular biology, due to a huge number of technical terms w...

Publication date: 2007